[1] "Evaluating clusters for Yakima Canyon"
[1] "Subset by Biological Year: TRUE"
[1] "Biological Year: 2022"
[1] "Subset by date range: FALSE"
[1] "From table: AnimalID_GPS_Data_AllCollars_2023_04_18"
The first step is to evaluate the social organization of collared animals from their GPS locations. Some herds have more obvious population substructure than others. This is a fairly basic analysis, but it can be helpful for high-level grouping. It may not work well for animals whose locations are multi-modal (i.e., multiple distinct activity centers), because a single median location can fall between activity centers.
To do this, we read data from the database, subset it to the specified parameters (biological year, date range), and then compute the median location for each animal over the time period of interest.
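As a minimal sketch of the summary step described above, the per-animal median location and coordinate spread could be computed as follows. The input column names (`AnimalID`, `X`, `Y`) and the simulated fixes are assumptions for illustration; the report's actual table structure is not shown here. The output names (`df.m`, `xy`, `medX`, `medY`, `sdX`, `sdY`) follow the names used in the clustering code.

```r
# Toy GPS fixes for three hypothetical animals (projected coordinates)
set.seed(42)
gps <- data.frame(
  AnimalID = rep(c("A1", "A2", "A3"), each = 50),
  X = c(rnorm(50, 0, 500), rnorm(50, 1000, 500), rnorm(50, 20000, 500)),
  Y = c(rnorm(50, 0, 500), rnorm(50, 500, 500), rnorm(50, 15000, 500))
)

# One row per animal: median location plus standard deviation per coordinate
df.m <- do.call(rbind, lapply(split(gps, gps$AnimalID), function(g) {
  data.frame(medX = median(g$X), medY = median(g$Y),
             sdX = sd(g$X), sdY = sd(g$Y))
}))

# Input for dist(): rows named by animal, columns medX and medY
xy <- df.m[, c("medX", "medY")]
print(xy)
```

Using the median rather than the mean keeps a few long-distance excursions from pulling an animal's summary location away from where it spends most of its time.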
Next, we compute a distance matrix from the median locations and apply hierarchical clustering using ‘hclust’. The function ‘cutree’ then determines the number of clusters from the height passed to it: the ‘h’ (height) parameter cuts the cluster dendrogram at a specific value rather than requesting a set number of clusters through the ‘k’ parameter. In this analysis, we could modify the value of ‘d’ below based on prior knowledge or on the distance we want to treat as the threshold for cluster membership. Here ‘d’ is set to 3x the median, across animals, of each animal's combined standard deviation in X and Y (sqrt(sdX^2 + sdY^2)). Alternatively, we could use one of several tests to optimize the choice of ‘k’ for each data set and agglomeration method. In exploratory analyses, the UPGMA or “average” method provided the best fit.
```r
library(sp)       # SpatialPointsDataFrame
library(stringr)  # str_flatten
library(knitr)    # kable

# perform clustering
p.dist <- dist(xy)
chc <- hclust(p.dist, method = "average")
xy.sp <- SpatialPointsDataFrame(matrix(c(xy$medX, xy$medY), ncol = 2),
                                data.frame(AnimalID = rownames(xy)),
                                proj4string = crs.projection)

# Distance threshold; a larger value will yield fewer clusters
# Fixed alternative: 6-7k, ~ axis of a typical Lookout Mountain home range
#d <- 6000
# Used here: 3x the median per-animal spread in location
d <- 3 * median(sqrt(df.m$sdX^2 + df.m$sdY^2))
chc.d5k <- cutree(chc, h = d)
nclust <- max(chc.d5k)

# Join results to display sp points
xy.sp@data <- data.frame(xy.sp@data, Clust = chc.d5k)

# Cluster membership, ordered
rownames(xy.sp@data) <- NULL
clusters <- sort(unique(xy.sp@data$Clust))
members <- character(length(clusters))
for (i in seq_along(clusters)) {
  membs <- xy.sp@data$AnimalID[xy.sp@data$Clust == clusters[i]]
  members[i] <- str_flatten(membs, collapse = ", ")
}
kable(data.frame(Cluster = clusters, Members = members), align = 'll')
```

| Cluster | Members |
|---|---|
| 1 | 23BS5651, 23BS5660, 23BS5670, 23BS5677, 23BS5707, 23BS5725 |
| 2 | 23BS5652, 23BS5654, 23BS5657, 23BS5669, 23BS5671, 23BS5672, 23BS5674, 23BS5678, 23BS5679, 23BS5680, 23BS5681, 23BS5682, 23BS5702, 23BS5709, 23BS5715, 23BS5722, 23BS5726 |
| 3 | 23BS5655, 23BS5665 |
| 4 | 23BS5658, 23BS5668, 23BS5684, 23BS5700, 23BS5723, 23BS5724 |
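The choice of agglomeration method, and the difference between cutting the dendrogram by height and by a fixed number of clusters, can be explored with a short sketch. It uses the cophenetic correlation as the fit measure, which is one common option; the report does not say which tests were used, so that criterion is an assumption. The median locations are simulated and the 4000 cut height is arbitrary.

```r
# Simulated median locations for 12 hypothetical animals in 3 loose groups
set.seed(1)
centers <- rbind(c(0, 0), c(8000, 0), c(0, 9000))
xy <- centers[rep(1:3, each = 4), ] + matrix(rnorm(24, sd = 800), ncol = 2)
rownames(xy) <- paste0("Animal", 1:12)
colnames(xy) <- c("medX", "medY")

p.dist <- dist(xy)

# Cophenetic correlation: how well each dendrogram preserves the
# original pairwise distances (closer to 1 is better)
methods <- c("single", "complete", "average", "ward.D2")
coph <- sapply(methods, function(m) {
  cor(p.dist, cophenetic(hclust(p.dist, method = m)))
})
print(round(coph, 3))

# Cutting by height (distance threshold) vs. by a fixed number of clusters
chc <- hclust(p.dist, method = "average")
by.height <- cutree(chc, h = 4000)
by.k <- cutree(chc, k = 3)
print(table(by.height, by.k))
```

Cutting by height lets the number of clusters vary from herd to herd while keeping the distance criterion fixed, whereas cutting by ‘k’ fixes the count and lets the implied distance vary.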
Evaluate our results visually:
Here we compute home ranges for every GPS-collared animal in the herd using the ‘adehabitatHR’ package and either a bivariate normal or a Brownian bridge kernel function. The amount of overlap between each pair of animals (area or UD) is then calculated and stored in a matrix. The kernel function, the minimum number of fixes, and the contour level used to derive the home range from the utilization distribution are set in the user-input section at the top of this R Markdown (.Rmd) file.
```r
if (HRestimator == "BB") {
  homeranges <- calculateBBHomerange(gps.sf, min.fixes = min.fixes, contour.percent = contour.percent, output.proj = projection)
} else {
  homeranges <- calculateHomerange(gps.sf, min.fixes = min.fixes, contour.percent = contour.percent, output.proj = projection)
}
```

[1] "Kernel function: Brownian bridge"
[1] "Total rows in GPS table for HR calculation: 4436"
[1] "Date range for HR calculation: 2023-01-18 22:00:37 to 2023-04-18 16:01:07"
[1] "Contour level is set to: 75 %"
This plot shows the amount of overlap between each pair of animals. It gets messy with large numbers of individuals, so it probably makes more sense to explore the relationships between animals by using this overlap measure in a clustering algorithm.
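The helper functions `calculateHomerange` and `calculateBBHomerange` are not shown in this report, so the following is not their implementation. It is a hedged sketch of how a pairwise overlap matrix could be built directly with ‘adehabitatHR’, using the bivariate normal kernel (`kernelUD`) and the area-based overlap method (`kerneloverlaphr` with `method = "HR"`). The simulated fixes and the column names are assumptions.

```r
library(sp)
library(adehabitatHR)

# Simulated fixes for three hypothetical animals (projected coordinates)
set.seed(7)
n <- 100
fixes <- data.frame(
  AnimalID = factor(rep(c("A1", "A2", "A3"), each = n)),
  X = c(rnorm(n, 0, 400), rnorm(n, 300, 800), rnorm(n, 6000, 400)),
  Y = c(rnorm(n, 0, 400), rnorm(n, 0, 800), rnorm(n, 6000, 400))
)
pts <- SpatialPointsDataFrame(as.matrix(fixes[, c("X", "Y")]),
                              data = fixes["AnimalID"])

# Utilization distributions estimated on one shared grid so that
# they can be compared cell by cell
ud <- kernelUD(pts, h = "href", grid = 100, same4all = TRUE)

# Row i, column j: fraction of animal i's 75% home range that lies
# inside animal j's 75% home range (not symmetric in general)
overlap <- kerneloverlaphr(ud, method = "HR", percent = 75,
                           conditional = FALSE)
print(round(overlap, 2))
```

Because each row is scaled by that animal's own home range area, a small range nested inside a large one gives a high value in one direction and a low value in the other.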
In the last tab, we computed a matrix containing the fraction of each animal's home range (by row) that falls within the home range of every other animal in the herd (by column). Treating this as a weighted adjacency matrix, we can use tools from the ‘igraph’ network analysis package to map clusters, viewing the data as a directed social network in which the connection between animals is weighted by the amount of home range overlap. Note that in a directed network, the connection from A to B can differ from the connection from B to A, which matches our data. Here the adjacency matrix is clustered using a hierarchical walktrap method and displayed in two plots: 1) a plot of the network and 2) a dendrogram. Note that the group colors in the network plot match the leaf text colors in the dendrogram.
| Cluster | Members |
|---|---|
| 1 | 23BS5655, 23BS5665 |
| 2 | 23BS5658, 23BS5668, 23BS5684, 23BS5700, 23BS5723, 23BS5724 |
| 3 | 23BS5660, 23BS5677 |
| 4 | 23BS5651, 23BS5670, 23BS5707, 23BS5725 |
| 5 | 23BS5652, 23BS5654, 23BS5671, 23BS5678, 23BS5702, 23BS5709, 23BS5722 |
| 6 | 23BS5657, 23BS5669, 23BS5672, 23BS5674, 23BS5679, 23BS5680, 23BS5681, 23BS5682, 23BS5715, 23BS5726 |
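The network clustering step can be sketched with ‘igraph’ as follows. The overlap matrix and animal IDs here are made up for illustration, and the report's own plotting options are not shown, so this is a minimal example of the approach rather than the code that produced the table above.

```r
library(igraph)

# Toy directed overlap matrix for five hypothetical animals: entry [i, j] is
# the fraction of animal i's home range inside animal j's home range
ids <- c("A1", "A2", "A3", "A4", "A5")
overlap <- matrix(c(
  0.0, 0.9, 0.7, 0.0, 0.0,
  0.5, 0.0, 0.6, 0.1, 0.0,
  0.6, 0.8, 0.0, 0.0, 0.0,
  0.0, 0.1, 0.0, 0.0, 0.8,
  0.0, 0.0, 0.0, 0.6, 0.0
), nrow = 5, byrow = TRUE, dimnames = list(ids, ids))

# Weighted, directed graph; zero entries produce no edge
g <- graph_from_adjacency_matrix(overlap, mode = "directed",
                                 weighted = TRUE, diag = FALSE)

# Walktrap community detection using overlap as the edge weight
wc <- cluster_walktrap(g, weights = E(g)$weight)
print(membership(wc))

# Network plot colored by community, then the merge dendrogram
plot(wc, g)
plot_dendrogram(wc)
```

Unlike the median-location clustering, this grouping depends on shared space use rather than on the distance between activity centers, which may explain why it can split a spatially compact cluster into several groups.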